Numerical Bayesian state assignment for a three-level quantum system 
I. Absolute-frequency data; constant and Gaussian-like priors 
(Dated: 29 April 2007)

This paper offers examples of concrete numerical applications of Bayesian quantum-state-assignment meth- 
ods to a three-level quantum system. The statistical operator assigned on the evidence of various measurement 
data and kinds of prior knowledge is computed partly analytically, partly through numerical integration (in eight 
dimensions) on a computer. The measurement data consist in absolute frequencies of the outcomes of N iden- 
tical von Neumann projective measurements performed on N identically prepared three-level systems. Various 
small values of N as well as the large-N limit are considered. Two kinds of prior knowledge are used: one
represented by a plausibility distribution constant in respect of the convex structure of the set of statistical operators;
the other represented by a Gaussian-like distribution centred on a pure statistical operator, and thus reflecting a 
situation in which one has useful prior knowledge about the likely preparation of the system. 

In a companion paper the case of measurement data consisting in average values, and an additional prior 
studied by Slater, are considered. 

PACS numbers: 03.67.-a,02.50.Cw,02.50.Tt,05.30.-d,02.60.-x 



1. INTRODUCTION 
1.1. Quantum-state assignment: theory. . . 

A number of different "quantum-state assignment" (or "re- 
construction", "estimation", "retrodiction") techniques have 
been studied in the literature. Their purpose is to encode var- 
ious kinds of measurement data and prior knowledge, espe- 
cially in cases in which the former is meager, into a statistical 
operator (or "density matrix") suitable for deriving the plau- 
sibilities of future or past measurement outcomes. The use 
of probabilistic methods is clearly essential in this task, 1 and 
they are implemented in a variety of ways. There are imple- 
mentations based on maximum-relative-entropy methods 2 and 
others based on more general Bayesian methods || [l0| O, 
|l2| , pj| , p"4| |i~5| l . Here we are concerned with the latter, which 
can apparently be used with a larger variety of prior knowledge
than the former.3 (Old statistical methods, like maximum
likelihood, are not considered here either, since they are
only special cases of the Bayesian ones.)

1 "Quantum-state tomography" (cf. e.g. […]) usually refers to the special
case in which, roughly speaking, the number of measurements and
measurement outcomes are sufficient to yield a unique statistical operator.
Mathematically, we have a well-posed inverse problem that does not
require plausible reasoning. This case is only achieved as the number of
outcomes gets larger and larger.

2 The literature on these is so vast as to render any small sample very unfair.
Early and latest contributions are […].

3 E.g., for a spin-1/2 system, knowledge that "the state that holds is either
the one represented by (the statistical operator) $|z^+\rangle\langle z^+|$ or the one
represented by $|z^-\rangle\langle z^-|$" is different from knowledge that "the state that
holds is either the one represented by $|x^+\rangle\langle x^+|$ or the one represented by
$|x^-\rangle\langle x^-|$", and this difference can be usefully exploited in some situations:
Make a measurement corresponding to the positive-operator-valued measure
$\{|z^+\rangle\langle z^+|, |z^-\rangle\langle z^-|\}$, and suppose you obtain the 'z+' result. Conditional
on the first kind of prior knowledge you then know that "the original state
was the one represented by $|z^+\rangle\langle z^+|$", whereas conditional on the second

The fundamental ideas behind the Bayesian techniques 
were developed gradually. A sample of more or less related 
studies could consist in the works by Segal [ |l6||, Helstrom [|l7| 
HI [0| , Band and Park [fcoj &j || |f 
Holevo [§§§, Bloore Ivanovic 
Larson and Dukes A Jones @ ||], Malley and Horn- 
stei n [f^ Slater [Rolfi]], and many others MiJLi PI P 

; some central points can already be found 
in Bloch [§§. Such a dull list unfortunately does not do jus- 
tice to the relative importance of the individual contributions 
(some of which are just rediscoveries of earlier ones); those 
by Helstrom, Holevo, Larson and Dukes, and Jones, however, 
deserve special mention. 

you know now just as much as before. But in quantum maximum-entropy
methods both kinds of prior knowledge are encoded in the same way, viz. as
the same "completely mixed" statistical operator to be used with the quantum
relative entropy; these methods thus provide less predictive power in
this example.

All Bayesian quantum-state assignment techniques more or
less agree in the expression used to calculate the statistical
operator $\rho_{D\land I}$ encoding the measurement data $D$ and the prior
knowledge $I$. The 'conditions', or 'states', in which the system
can be prepared are represented by statistical operators
$\rho$, whose set we denote by $\mathbb{S}$. Let the prior knowledge $I$
about the possible state in which the system is prepared be
expressed by a 'prior' plausibility distribution $p(\rho|I)\,d\rho = g(\rho)\,d\rho$
(where $d\rho$ is a volume element on $\mathbb{S}$ or a subset
thereof, and $g$ a plausibility density; more technical details
are given below). Let the measurement data $D$ consist in a
set of $N$ outcomes $i_1, \dots, i_k, \dots, i_N$ of $N$ measurements,
represented by the $N$ positive-operator-valued measures
$\{E^{(k)}_\mu : \mu = 1, \dots, r_k\}$, $k = 1, \dots, N$. Bayesian quantum-state assignment
techniques yield a 'posterior' plausibility distribution of the
form

$$
p(\rho|D\land I)\,d\rho
= \frac{p(D|\rho)\,p(\rho|I)\,d\rho}{\int_{\mathbb{S}} p(D|\rho)\,p(\rho|I)\,d\rho}
= \frac{\bigl[\prod_k \operatorname{tr}\bigl(E^{(k)}_{i_k}\rho\bigr)\bigr]\,g(\rho)\,d\rho}
       {\int_{\mathbb{S}} \bigl[\prod_k \operatorname{tr}\bigl(E^{(k)}_{i_k}\rho\bigr)\bigr]\,g(\rho)\,d\rho},
\tag{1a}
$$

and a statistical operator $\rho_{D\land I}$ given by a sort of weighted
average,4

$$
\rho_{D\land I} := \int_{\mathbb{S}} \rho\,p(\rho|D\land I)\,d\rho
= \frac{\int_{\mathbb{S}} \rho\,\bigl[\prod_k \operatorname{tr}\bigl(E^{(k)}_{i_k}\rho\bigr)\bigr]\,g(\rho)\,d\rho}
       {\int_{\mathbb{S}} \bigl[\prod_k \operatorname{tr}\bigl(E^{(k)}_{i_k}\rho\bigr)\bigr]\,g(\rho)\,d\rho}.
\tag{1b}
$$

These formulae may present differences of detail from author 
to author, reflecting — quite excitingly! — different philo- 
sophical stands. For instance, the prior distribution (and there- 
fore the integration) is in general defined over the whole set 
of statistical operators; but a person who conceives only pure 
statistical operators as representing sort of "real, internal (mi- 
croscopic) states of the system" may restrict it to those only. 
A person who, on the other hand, thinks of the statistical
operators themselves as encoding "states of knowledge" on a par
with plausibility distributions, might see the prior distribution
as a "plausibility of a plausibility", and thus prefer to derive
the formula above through a quantum analogue of de Finetti's
theorem; in this case the derivation will involve a tensor
product $\rho \otimes \cdots \otimes \rho$ of multiple copies of the same statistical
operator.5

The formulae (1) (or special cases thereof) are proposed
and used in Larson and Dukes Jones p7| , g8|], and e.g. 
Slater [[HJ], Derka, Buzek, et al. [| ^ g], and Mana [jffj. 
We arrived at these same formulae (as special cases of formu- 
lae applicable to generic, not necessarily quantum- theoretical 
systems) in a series of papers [ |58[ ^9, 70, 7l|] (see also [ p7[ 
f72[|) in which we studied and tried to solve the various philo- 
sophical issues to our satisfaction. 



1.2. . . . and practice

In regard to the numerical computation of formulae (1) in
actual or fictive state-assignment problems, with explicitly
given prior distributions and measurement data, the number
of studies is much smaller. The main problem is that formula (1b),
when applied to a $d$-level system, generally involves
an integration over a complicated (see e.g. fig. 1 and
those in [73, 74, 75, 76]) convex region of $d^2 - 1$ dimensions
($2d - 2$ dimensions if only pure statistical operators are
considered), and one has to choose between explicit integration
limits but very complex integrands, or, vice versa, simpler
integrands but implicitly defined integration regions.

Therefore explicit calculations have hitherto been confined
almost exclusively to two-level systems, which have the obvious
advantages of low dimensionality and symmetry (the
set of statistical operators is the three-dimensional Bloch
ball [77, 78]); in some cases these allow the derivation of
analytical results.6 In studies by Jones […], Larson and
Dukes […], Slater […, 79] (these are very interesting studies;
cf. also […]), and Bužek, Derka, et al. […], the
posterior distributions $p(\rho|D\land I)\,d\rho$ and the ensuing statistical
operators $\rho_{D\land I}$ are explicitly calculated for measurement data
$D$ and priors $I$ of various kinds. In some of these studies the
integrations range over the whole set of statistical operators,
in others over the pure ones only. As regards higher-level systems,
the only numerical study known to us is that by Bužek
et al. [54] for a spin-3/2 system; however, they assume from
the start that the a priori possible statistical operators are
confined to a three-dimensional subset of the pure ones; this
assumption simplifies the integration problem from 15 to 3
dimensions.

1.3. More practice: the place of the present study

In this paper and its companion [83] we provide numerical
examples of quantum-state assignment, via eqs. (1), for a
three-level system. The set of statistical operators of such a
system, $\mathbb{S}_3$, is eight-dimensional (a high but still computationally
tractable number of dimensions) and has fewer symmetries,
in respect of its dimensionality, than that of a two-level
one: the set for a two-level system is a ball in $\mathbb{R}^3$, but that
for a three-level one is definitely not a ball in $\mathbb{R}^8$.7 Some three-dimensional
sections of this eight-dimensional set are given in fig. 1 (two-dimensional
sections can be found in […]; four-dimensional
ones are also available [76]); see also Bloore's very interesting
study […].

We study data $D$ and prior knowledge $I$ of the following
kind:


4 Note that, as shown in § 2, the derivation of the formula for $\rho_{D\land I}$ does not
require decision-theoretical concepts.

5 We leave to the reader the entertaining task of identifying these various 
philosophical stances in the references already provided. 



6 The high symmetry, however, renders the results independent of the particular
choice of prior knowledge in some cases, e.g. when the prior is
spherically symmetric and the data consist of averages.

7 In group-theoretical terms, the "quantum" symmetries of the set of statistical
operators of a three-level system are fewer than those it could have had
as an eight-dimensional compact convex set. The former symmetries are in
fact equivalent to the group U(3)/U(1) […, § 4], of 8 dimensions,
whereas the latter could have been as large as the group SO(8), of
28 dimensions […]. Compare with the case of a two-level system,
whose symmetry group U(2)/U(1), of 3 dimensions, is isomorphic to
the largest symmetry group that a three-dimensional compact convex body
can have, SO(3). (We have only considered the connected part of these
groups; one should also take the semidirect product with $\mathbb{Z}_2$.)




Figure 1: Some three-dimensional sections of the eight-dimensional set $\mathbb{S}_3$ of the statistical operators for a three-level quantum system.
The adopted coordinate system is explained in § 3.
The measurement data $D$ consist in a set of $N$ outcomes
of $N$ instances of the same measurement performed
on $N$ identically prepared systems. The measurement
is represented by the extreme positive-operator-valued
measure (i.e., non-degenerate 'von Neumann
measurement') having three possible distinct outcomes
{'1', '2', '3'} represented by the eigenprojectors
$\{|1\rangle\langle 1|, |2\rangle\langle 2|, |3\rangle\langle 3|\}$. The data $D$ thus correspond to a
triple of absolute frequencies $(N_1, N_2, N_3) =: \bar{N}$, with
$N_i \ge 0$ and $\sum_i N_i = N$. We consider various such triples
for small values of $N$, as well as for the limiting case of
very large $N$.

Two different kinds of prior knowledge $I$ are used. The
first, $I_{\mathrm{co}}$, is represented by a prior plausibility distribution

$$
p(\rho|I_{\mathrm{co}})\,d\rho = g_{\mathrm{co}}(\rho)\,d\rho \propto d\rho,
\tag{2}
$$

which is constant in respect of the convex structure of
the set of statistical operators, in the sense explained in
§§ 3 and 4. The second, $I_{\mathrm{ga}}$, is represented by a spherically
symmetric, Gaussian-like prior distribution



P(p\ 4a) d P = £ga(p) dp oc exp - 



tr[(p - |2)<2|) 2 ] 



dp, (3) 



centred on the statistical operator $|2\rangle\langle 2|$, one of the projectors
of the von Neumann measurement. This prior
expresses some kind of knowledge that leads us to assign
higher plausibility to regions in the vicinity of
$|2\rangle\langle 2|$.

To assign a statistical operator $\rho_{D\land I}$ from these data and
priors means to assign eight independent real coefficients of
its matrix elements, or equivalently a vector of eight real parameters
bijectively associated with them. These parameters,
according to eq. (1b), must be computed by the integration
of a function (actually two, the other being a normalisation
factor) defined over the set of all statistical operators. Hence
the function itself and the integration region can be expressed
in terms of eight coordinates, corresponding to the parameters.
The coordinate system should be chosen in such a way
that both the function and the integration limits have a not too
complex form. For these reasons we choose the parametrisation
studied in particular by Kimura [74]. In this case the
vector of real parameters associated with a statistical operator
is called a 'Bloch vector'.

In such a coordinate system, six of the eight parameters can 
be calculated analytically and quite straightforwardly by sym- 
metry arguments, for all absolute-frequency triples N. The 
remaining two parameters have been numerically calculated 
for some triples N by a computer using quasi-Monte Carlo 
integration methods, suitable for high-dimensional problems. 
Further symmetry arguments yield the parameters for the re- 
maining triples. 

All these points as well as the results are discussed in the
paper as follows: In § 2 we quickly present the reasoning leading
to the statistical-operator-assignment formulae (1), and
particularise the latter to our study. In § 3 Kimura's parametrisation
and the Bloch-vector set are introduced. The two prior
distributions adopted are discussed in § 4. The calculation,
by symmetry arguments and by numerical integration, of the
Bloch vectors and of the corresponding statistical operators
is presented in § 5 for all data and priors. In § 6 we offer
some remarks on the incorporation into the formalism of uncertainties
in the detection of outcomes. In § 7 we discuss the
form the assigned statistical operator takes in the limit of a
very large number of measurements. Finally, the last section
summarises and discusses the main points and results.



2. STATISTICAL-OPERATOR ASSIGNMENT 
2.1. General case 

This section provides a summary derivation of the formulae 
for statistical-operator assignment. For a more general deriva- 
tion of analogous formulae valid for any kind of system (clas- 
sical, quantum, or exotic), and for a discussion of some philo- 
sophical points involved, we refer the reader to g |^ [70| f7l]l 
and also [00. 

There is a preparation scheme that produces quantum systems
always in the same 'condition', the same 'state'. We
do not know which this condition is, amongst a set of possible
ones;8 although there may be some conditions in that set
that are more plausible than others. Our knowledge $I$, in other
words, is expressed by a plausibility distribution over these
conditions. To each condition is associated a statistical operator;
this encodes the plausibility distributions that we assign
for all possible quantum measurements, given that that particular
condition holds. Therefore we can and shall more simply
speak in terms of statistical operators instead of the respective
conditions. Note that this is, however, a metonymy, i.e. we
are speaking about something ('statistical operator') although
it is something else but related to it ('condition') that we really
mean.

We thus have a plausibility distribution over some statistical
operators. It can in full generality be written as

$$
p(\rho|I)\,d\rho = g(\rho)\,d\rho,
\tag{4}
$$

defined over the whole set of statistical operators, denoted by
$\mathbb{S}$. The function $g$ is a normalised positive generalised function.9
In this way the more general case is also accounted for
in which the whole set of statistical operators $\mathbb{S}$ is involved:
the case with a finite number of a priori possible statistical
operators corresponds to a $g$ equal to a sum of appropriately
weighted Dirac deltas.10

8 We intentionally use the vague term 'condition', since each researcher can
understand it in terms of his or her favourite physical picture (internal
microscopic configurations, macroscopic procedures, pilot waves, propensities,
grounds for judgements of exchangeability, or whatnot). Quantum
theory offers no concrete physical picture, only some constraints on how
such a picture should work; so each one can provide one's favourite.

9 See footnotes 15 and 16.

Our 'prior' knowledge $I$ about the preparation can be represented
by a unique statistical operator: Suppose we are to
give the plausibility of the $\mu$th outcome of an arbitrary measurement,
represented by the positive-operator-valued measure
$\{E_\mu\}$, performed on a system produced according to the
preparation. Quantum mechanics dictates the plausibilities
$p(E_\mu|\rho) = \operatorname{tr}(E_\mu\rho)$, and by the rules of plausibility theory
we assign, conditional on $I$,

$$
p(E_\mu|I) = \int_{\mathbb{S}} p(E_\mu|\rho)\,p(\rho|I)\,d\rho
           = \int_{\mathbb{S}} \operatorname{tr}(E_\mu\rho)\,g(\rho)\,d\rho,
\tag{5}
$$

or more compactly, by linearity of the trace,

$$
p(E_\mu|I) = \operatorname{tr}\Bigl[E_\mu \int_{\mathbb{S}} \rho\,g(\rho)\,d\rho\Bigr]
           = \operatorname{tr}(E_\mu\rho_I),
\tag{6}
$$

with the statistical operator $\rho_I$ defined as

$$
\rho_I := \int_{\mathbb{S}} \rho\,p(\rho|I)\,d\rho = \int_{\mathbb{S}} \rho\,g(\rho)\,d\rho.
\tag{7}
$$

The prior knowledge $I$ can thus be compactly represented by,
or "encoded in", the statistical operator $\rho_I$. Note how $\rho_I$ appears
naturally, without the need to invoke decision-theoretic
arguments and concepts, like cost functions etc. Note also that
the association between $I$ and $\rho_I$ is by construction valid for
generic knowledge $I$, be it "prior" or not.

The statistical operator $\rho_I$ is a "disposable" object. As soon
as we know the outcome of a measurement on a system produced
according to our preparation, the plausibility distribution
$p(\rho|I)\,d\rho$ should be updated on the evidence of this new
piece of data $D$, and thus we get a new statistical operator
$\rho_{I\land D}$. And so on. It is a fundamental characteristic of plausibility
theory that this update can indifferently be performed
with a piece of data at a time or all at once.

10 The knowledge $I$ and all inferential steps to follow concern a preparation
scheme in general and not specifically this or that system only; just like
tastings of cakes made according to a given unknown recipe increase our
knowledge of the recipe, not only of the cakes. If one insists in seeing the
knowledge $I$ and the various inferences as referring to a given set of, say,
$M$ systems only, then that knowledge is represented by a plausibility distribution
over the statistical operators of these $M$ systems, i.e. over the Cartesian
product $\mathbb{S}^M$, and has the form
$p(\rho^{(1)}, \dots, \rho^{(M)}|I)\,d\rho^{(1)}\cdots d\rho^{(M)} =
g(\rho^{(1)})\,\delta(\rho^{(2)} - \rho^{(1)})\cdots\delta(\rho^{(M)} - \rho^{(1)})\,d\rho^{(1)}\cdots d\rho^{(M)}$.
Integrations are then also to be understood accordingly. Note moreover that if we consider
joint quantum measurements on all the systems together, then we are
really dealing with one quantum system, not $M$.

11 We do not explicitly write the prior knowledge $I$ whenever the statistical
operator appears on the conditional side of the plausibility; i.e.,
$p(\cdot|\rho) := p(\cdot|\rho, I)$.

So suppose we come to know that $N$ measurements, represented
by the $N$ positive-operator-valued measures
$\{E^{(k)}_\mu : \mu = 1, \dots, r_k\}$, $k = 1, \dots, N$, are or have been performed on $N$
systems for which our knowledge $I$ holds. Note that some,
even all, of the measurements (and therefore their positive-operator-valued
measures) can be identical. The outcomes
$i_1, \dots, i_k, \dots, i_N$ are or were obtained; this is our new data
$D$. The plausibility for this to occur, according to the prior
knowledge $I$, is given by a generalisation of expression (5):

$$
p(D|I) = p\bigl(E^{(1)}_{i_1}, \dots, E^{(N)}_{i_N}\big|I\bigr)
       = \int_{\mathbb{S}} \Bigl[\prod_k p\bigl(E^{(k)}_{i_k}\big|\rho\bigr)\Bigr]\,p(\rho|I)\,d\rho.
\tag{8}
$$

On the evidence of $D$ we can update the prior plausibility distribution
$p(\rho|I)\,d\rho$. By the rules of plausibility theory

$$
p(\rho|D\land I)\,d\rho
= \frac{p(D|\rho)\,p(\rho|I)\,d\rho}{\int_{\mathbb{S}} p(D|\rho)\,p(\rho|I)\,d\rho}
= \frac{\bigl[\prod_k \operatorname{tr}\bigl(E^{(k)}_{i_k}\rho\bigr)\bigr]\,g(\rho)\,d\rho}
       {\int_{\mathbb{S}} \bigl[\prod_k \operatorname{tr}\bigl(E^{(k)}_{i_k}\rho\bigr)\bigr]\,g(\rho)\,d\rho}.
\tag{9}
$$

The statistical operator encoding the joint knowledge $D \land I$
is thus, according to eq. (7) and using eq. (9),

$$
\rho_{D\land I} := \int_{\mathbb{S}} \rho\,p(\rho|D\land I)\,d\rho
= \frac{\int_{\mathbb{S}} \rho\,\bigl[\prod_k \operatorname{tr}\bigl(E^{(k)}_{i_k}\rho\bigr)\bigr]\,g(\rho)\,d\rho}
       {\int_{\mathbb{S}} \bigl[\prod_k \operatorname{tr}\bigl(E^{(k)}_{i_k}\rho\bigr)\bigr]\,g(\rho)\,d\rho}.
\tag{10}
$$
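As an illustration only, the following Python sketch evaluates the assignment formula for the simplest kind of prior, a $g$ equal to a sum of weighted Dirac deltas, in which case the integrals reduce to finite sums. It reproduces the spin-1/2 example of footnote 3. The function names (`assign_state`, `proj`) and the equal prior weights are assumptions made for this sketch and do not appear in the paper.

```python
import numpy as np

def proj(v):
    """Projector |v><v| onto the (normalised) vector v."""
    v = np.asarray(v, dtype=complex)
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())

def assign_state(candidates, prior_weights, observed_elements):
    """Assignment formula for a prior concentrated on finitely many states.

    candidates        : list of a priori possible statistical operators
    prior_weights     : their prior plausibilities
    observed_elements : one POVM element per observed outcome
    Returns the posterior weights and the assigned statistical operator.
    """
    w = np.array(prior_weights, dtype=float)
    for E in observed_elements:
        # multiply by the likelihood tr(E rho) of each candidate
        w = w * np.array([np.trace(E @ rho).real for rho in candidates])
    w = w / w.sum()  # normalisation (the denominator of the formula)
    rho_assigned = sum(wi * rho for wi, rho in zip(w, candidates))
    return w, rho_assigned

z_plus, z_minus = proj([1, 0]), proj([0, 1])
x_plus, x_minus = proj([1, 1]), proj([1, -1])

# One z-measurement with result 'z+', under the two kinds of prior knowledge
w_z, rho_z = assign_state([z_plus, z_minus], [0.5, 0.5], [z_plus])
w_x, rho_x = assign_state([x_plus, x_minus], [0.5, 0.5], [z_plus])
```

With the first prior the posterior weight concentrates on the 'z+' state, whereas with the second the weights, and hence the assigned operator, are unchanged by the datum.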



2.2. Three-level case 

So far everything has been quite general. Let us now con- 
sider the particular cases studied in this paper. 

The preparation scheme concerns three-level quantum systems;
the corresponding set of statistical operators will be denoted
by $\mathbb{S}_3$. The measurements considered here are all
instances of the same measurement, namely a non-degenerate
projection-valued measurement (often called 'von Neumann
measurement'). Thus, for all $k = 1, \dots, N$,
$\{E^{(k)}_\mu\} = \{E_\mu\} := \{|1\rangle\langle 1|, |2\rangle\langle 2|, |3\rangle\langle 3|\}$.
The projectors $|1\rangle\langle 1|$, $|2\rangle\langle 2|$, $|3\rangle\langle 3|$ define
an orthonormal basis in Hilbert space. All relevant operators
will, quite naturally and advantageously, be expressed in this
basis. We have for example that $\operatorname{tr}(E_\mu\rho) = \rho_{\mu\mu}$, the $\mu$th
diagonal element of $\rho$.

The data $D$ consist in the set of outcomes $\{i_1, \dots, i_N\}$ of the
$N$ measurements, where each $i_k$ is one of the three possible
outcomes '1', '2', or '3'. The formula (10) for the statistical
operator thus takes the form

$$
\rho_{D\land I}
= \frac{\int_{\mathbb{S}_3} \rho\,\bigl[\prod_{k=1}^{N} \rho_{i_k i_k}\bigr]\,g(\rho)\,d\rho}
       {\int_{\mathbb{S}_3} \bigl[\prod_{k=1}^{N} \rho_{i_k i_k}\bigr]\,g(\rho)\,d\rho},
\tag{11}
$$

with $i_k \in \{1, 2, 3\}$ for all $k$.

However, it is clear from the expressions in the integrals
above that the exact order of the sequence of '1's, '2's, and '3's
is unimportant; only the absolute frequencies $(N_1, N_2, N_3)$ of
appearance of these three possible outcomes matter (naturally,
$N_i \ge 0$ and $\sum_i N_i = N$). We can thus rewrite the last equation
as

$$
\rho_{D\land I}
= \frac{\int_{\mathbb{S}_3} \rho\,\bigl[\prod_{i=1}^{3} \rho_{ii}^{N_i}\bigr]\,g(\rho)\,d\rho}
       {\int_{\mathbb{S}_3} \bigl[\prod_{i=1}^{3} \rho_{ii}^{N_i}\bigr]\,g(\rho)\,d\rho},
\tag{12}
$$

with the convention, here and in the following, that $\rho_{ii}^{N_i} := 1$
whenever $N_i = \rho_{ii} = 0$ (the reason is that the product originally
is, to wit, restricted to the terms with $N_i > 0$).
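The passage from the ordered sequence of outcomes to the absolute frequencies can be checked numerically with a short sketch. The diagonal values and the outcome sequence below are hypothetical, and the function names are ours. Note that NumPy evaluates `0.0 ** 0` as `1.0`, which matches the convention just stated.

```python
import numpy as np
from collections import Counter

def likelihood_from_sequence(diag, outcomes):
    """Product of rho_{i_k i_k} over the ordered outcomes (labels 1, 2, 3)."""
    return float(np.prod([diag[i - 1] for i in outcomes]))

def likelihood_from_counts(diag, counts):
    """Product of rho_ii ** N_i over the three outcomes."""
    return float(np.prod(np.asarray(diag, dtype=float) ** np.asarray(counts)))

diag = (0.5, 0.0, 0.5)   # hypothetical diagonal (rho_11, rho_22, rho_33)
seq = [1, 3, 3, 1, 1]    # hypothetical sequence of outcomes
tally = Counter(seq)
counts = [tally[i] for i in (1, 2, 3)]   # absolute frequencies (N_1, N_2, N_3)

same = np.isclose(likelihood_from_sequence(diag, seq),
                  likelihood_from_counts(diag, counts))
```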

The discussion of the explicit form of the prior $g(\rho)\,d\rho$ is
deferred to § 4. We shall first introduce on $\mathbb{S}_3$ a suitable coordinate
system $(x_1, \dots, x_8) =: x \in \mathbb{R}^8$ so as to explicitly calculate
the integrals. This is done in the next section.



3. BLOCH VECTORS 

In order to calculate the integrals required in the state-assignment
formula (12) we put a suitable coordinate system
on $\mathbb{S}_3$, so that they "translate" as integrals in $\mathbb{R}^8$. In
differential-geometrical terms, we choose a particular chart on
$\mathbb{S}_3$ considered as a differentiable manifold […].

There exists an 'Euler angle' parametrisation [100, 101,
102, 103] which maps $\mathbb{S}_3$ onto a rectangular region of $\mathbb{R}^8$
(modulo identification of some points). With this parametrisation
the integration limits of our integrals become advantageously
independent, but the integrands ($p(D|\rho)$ in particular)
acquire too complex a form.

For the latter reason we choose, instead, the parametrisation
studied by Byrd, Slater and Khaneja [101, 104], Kimura [74]
(see also [75]), and Bölükbaşı and Dereli [105], amongst others.
The functions to be integrated take simple polynomial or
exponential forms. The integration limits are no longer independent,
though; in fact, they are given in an implicit form
and will be accounted for by multiplying the integrands by a
characteristic function.

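We follow Kimura's study [74] here, departing from it on
some definitions. All statistical operators of a $d$-level quantum
system can be written in the following form [74] (see also [75,
101, 104]):

$$
\rho = \rho(x) = \frac{1}{d} I_d + \frac{1}{2}\sum_{j=1}^{n} x_j \lambda_j,
\qquad (x_1, \dots, x_n) =: x \in \mathbb{B}_n \subset \mathbb{R}^n,
\tag{13}
$$

where $n = d^2 - 1$ is the dimension of $\mathbb{S}_d$, and $\mathbb{B}_n$ is a compact
convex subset of $\mathbb{R}^n$. The operators $\{\lambda_j\}$ satisfy (1)
$\lambda_j = \lambda_j^\dagger$, (2) $\operatorname{tr}\lambda_j = 0$, (3) $\operatorname{tr}(\lambda_i\lambda_j) = 2\delta_{ij}$. Together with
the identity operator $I_d$ they are generators of SU($d$), and
in respect of the Frobenius (Hilbert-Schmidt) inner product
$\lambda_i \cdot \lambda_j := \operatorname{tr}(\lambda_i^\dagger\lambda_j)$ […] they also constitute a complete
orthogonal basis for the vector space of Hermitean operators on
a $d$-dimensional Hilbert space. In fact, eq. (13) is simply the
decomposition of the Hermitean operator $\rho$ in terms of such
a basis. The vector $x \equiv (x_j)$ of coefficients in equation (13) is
uniquely determined by $\rho$:

$$
x_j \equiv x_j(\rho) = \operatorname{tr}(\lambda_j\rho).
\tag{14}
$$

The operators $\{\lambda_j\}$, being Hermitean, can also be regarded
as observables and then the equation above says that the $(x_j)$
are the corresponding expectation values in the state $\rho$:
$x_j = \langle\lambda_j\rangle_\rho$ [106].

A systematic construction of generators of SU($d$) which
generalises the Pauli spin operators is known (see e.g. [74,
107]). In particular, for $d = 2$ they are the usual Pauli spin
operators, and for $d = 3$ they are the Gell-Mann matrices
(see e.g. [108]). In the eigenbasis $\{|1\rangle\langle 1|, |2\rangle\langle 2|, |3\rangle\langle 3|\}$ of the
von Neumann measurement $\{E_\mu\}$ introduced in the previous
section these matrices assume the particular form

$$
\begin{gathered}
\lambda_1 = \begin{pmatrix} 0 & 1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 0 \end{pmatrix},\quad
\lambda_2 = \begin{pmatrix} 0 & -i & 0\\ i & 0 & 0\\ 0 & 0 & 0 \end{pmatrix},\quad
\lambda_3 = \begin{pmatrix} 1 & 0 & 0\\ 0 & 0 & 0\\ 0 & 0 & -1 \end{pmatrix},\\
\lambda_4 = \begin{pmatrix} 0 & 0 & 1\\ 0 & 0 & 0\\ 1 & 0 & 0 \end{pmatrix},\quad
\lambda_5 = \begin{pmatrix} 0 & 0 & -i\\ 0 & 0 & 0\\ i & 0 & 0 \end{pmatrix},\quad
\lambda_6 = \begin{pmatrix} 0 & 0 & 0\\ 0 & 0 & 1\\ 0 & 1 & 0 \end{pmatrix},\\
\lambda_7 = \begin{pmatrix} 0 & 0 & 0\\ 0 & 0 & -i\\ 0 & i & 0 \end{pmatrix},\quad
\lambda_8 = \frac{1}{\sqrt{3}}\begin{pmatrix} 1 & 0 & 0\\ 0 & -2 & 0\\ 0 & 0 & 1 \end{pmatrix}.
\end{gathered}
\tag{15a}
$$

We see that our von Neumann measurement corresponds to
the observable

$$
\lambda_3 \equiv |1\rangle\langle 1| + 0\,|2\rangle\langle 2| - |3\rangle\langle 3|,
\tag{15b}
$$

the measurement outcomes being associated with the particular
values 1, 0, and -1. These eigenvalues, however, are of no
importance to us (they will be more relevant in the companion
paper [83]).

For a three-level system, and in the eigenbasis
$\{|1\rangle\langle 1|, |2\rangle\langle 2|, |3\rangle\langle 3|\}$, the operator $\rho$ in (13) can thus be
written in matrix form as:

$$
\rho = \rho(x) =
\begin{pmatrix}
\tfrac13 + \tfrac12\bigl(x_3 + \tfrac{1}{\sqrt3}x_8\bigr) & \tfrac12(x_1 - i x_2) & \tfrac12(x_4 - i x_5)\\[2pt]
\tfrac12(x_1 + i x_2) & \tfrac13 - \tfrac{1}{\sqrt3}x_8 & \tfrac12(x_6 - i x_7)\\[2pt]
\tfrac12(x_4 + i x_5) & \tfrac12(x_6 + i x_7) & \tfrac13 + \tfrac12\bigl(-x_3 + \tfrac{1}{\sqrt3}x_8\bigr)
\end{pmatrix}.
\tag{16}
$$

This matrix is Hermitean and has unit trace, so the remaining
condition for it to be a statistical operator is that it be positive
semi-definite (non-negative eigenvalues). This is equivalent to
two conditions [74] for the coefficients $x$: with our definitions
of the Gell-Mann matrices, the first is

$$
\sum_{k=1}^{8} x_k^2 \le \frac{4}{3},
\tag{17a}
$$

which limits $x$ to be inside or on a ball of radius $2/\sqrt{3}$; the
second is

$$
\begin{aligned}
8 &- 18\,x^2 + 27\,x_3\,(x_1^2 + x_2^2 - x_6^2 - x_7^2) - 6\sqrt{3}\,x_8^3\\
  &+ 9\sqrt{3}\,x_8\,\bigl[2(x_3^2 + x_4^2 + x_5^2) - (x_1^2 + x_2^2 + x_6^2 + x_7^2)\bigr]\\
  &+ 54\,(x_1 x_4 x_6 + x_1 x_5 x_7 + x_2 x_5 x_6 - x_2 x_4 x_7) \ge 0,
\end{aligned}
\tag{17b}
$$

where $x^2 := \sum_k x_k^2$.

The set of all real vectors $x$ satisfying conditions (17) is called
the 'Bloch-vector set' $\mathbb{B}_8$ of the three-level system:

$$
\mathbb{B}_8 := \{x \in \mathbb{R}^8 \mid \text{eqs. (17) hold}\}.
\tag{18}
$$

Since there is a bijective correspondence between $\mathbb{B}_8$ and $\mathbb{S}_3$,
we can parametrise the set of all statistical operators $\mathbb{S}_3$ by the
set of all Bloch vectors.12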
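The parametrisation can be made concrete with a small NumPy sketch that builds the eight matrices in the measurement eigenbasis (with $\lambda_3$ and $\lambda_8$ diagonal as described above), checks the orthogonality relation $\operatorname{tr}(\lambda_i\lambda_j) = 2\delta_{ij}$, and implements the maps $x \mapsto \rho(x)$ and $\rho \mapsto x(\rho)$. The function and variable names are illustrative assumptions, not taken from the paper, and indices are zero-based in the code.

```python
import numpy as np

def gell_mann_like():
    """The eight traceless Hermitean matrices, in the measurement eigenbasis."""
    lam = np.zeros((8, 3, 3), dtype=complex)
    lam[0][0, 1] = lam[0][1, 0] = 1              # lambda_1
    lam[1][0, 1], lam[1][1, 0] = -1j, 1j         # lambda_2
    lam[2] = np.diag([1, 0, -1])                 # lambda_3
    lam[3][0, 2] = lam[3][2, 0] = 1              # lambda_4
    lam[4][0, 2], lam[4][2, 0] = -1j, 1j         # lambda_5
    lam[5][1, 2] = lam[5][2, 1] = 1              # lambda_6
    lam[6][1, 2], lam[6][2, 1] = -1j, 1j         # lambda_7
    lam[7] = np.diag([1, -2, 1]) / np.sqrt(3)    # lambda_8
    return lam

LAMBDA = gell_mann_like()

def rho_of_x(x):
    """Statistical operator for a Bloch vector x: I/3 + (1/2) sum_j x_j lambda_j."""
    return np.eye(3) / 3 + 0.5 * np.tensordot(np.asarray(x, dtype=float), LAMBDA, axes=1)

def x_of_rho(rho):
    """Bloch vector of rho: x_j = tr(lambda_j rho)."""
    return np.real(np.einsum('jab,ba->j', LAMBDA, rho))

# Gram matrix tr(lambda_i lambda_j); it should equal 2 * identity
gram = np.real(np.einsum('iab,jba->ij', LAMBDA, LAMBDA))

# Bloch vector of the pure state |2><2|
x_hat = x_of_rho(np.diag([0.0, 1.0, 0.0]))
```

In this sketch the pure state $|2\rangle\langle 2|$ has all components zero except the last, which equals $-2/\sqrt{3}$ and so lies on the sphere of radius $2/\sqrt{3}$.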

Both $\mathbb{S}_3$ and $\mathbb{B}_8$ are convex sets […], and the maps

$$
\mathbb{S}_3 \to \mathbb{B}_8 \quad\text{by}\quad \rho \mapsto x(\rho)
\tag{19}
$$

given by (14), and its inverse

$$
\mathbb{B}_8 \to \mathbb{S}_3 \quad\text{by}\quad x \mapsto \rho(x)
\tag{20}
$$

given by (13) or (16) are convex isomorphisms, i.e. they preserve
convex combinations:

$$
x(\alpha'\rho' + \alpha''\rho'') = \alpha'\,x(\rho') + \alpha''\,x(\rho''),
\tag{21}
$$

$$
\rho(\alpha' x' + \alpha'' x'') = \alpha'\,\rho(x') + \alpha''\,\rho(x''),
\tag{22}
$$

with $\alpha', \alpha'' \ge 0$, $\alpha' + \alpha'' = 1$. This fact will be relevant for
the discussion of the prior distributions.

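It is useful to introduce the characteristic function $x \mapsto \chi_B(x)$
of the set $\mathbb{B}_8$:

$$
\chi_B(x) :=
\begin{cases}
1 & \text{if } x \in \mathbb{B}_8, \text{ i.e. if (17) hold},\\
0 & \text{if } x \notin \mathbb{B}_8, \text{ i.e. if (17) do not hold},
\end{cases}
\tag{23}
$$

and to consider the smallest eight-dimensional rectangular region
(or 'orthotope' [110]) $\mathbb{C}_8$ containing $\mathbb{B}_8$. As shown in the
appendix, $\mathbb{C}_8$ is

$$
\mathbb{C}_8 := [-1, 1]^7 \times \bigl[-\tfrac{2}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}\bigr].
\tag{24}
$$

The relations amongst $\mathbb{S}_3$, $\mathbb{B}_8$, and $\mathbb{C}_8$ are schematically illustrated
in fig. 2. In fig. 1 we can see some three-dimensional
sections (through the origin) of $\mathbb{B}_8$, and thus of $\mathbb{S}_3$ as well,
in the sense of their isomorphism.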
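A minimal sketch of the characteristic function follows, assuming the ball condition $\sum_k x_k^2 \le 4/3$ and the cubic (determinant) condition as the two defining inequalities, and the matrix form of $\rho(x)$ with diagonal $\rho_{22} = 1/3 - x_8/\sqrt{3}$. It can be cross-checked against a direct eigenvalue test of positive semi-definiteness. The names `chi_B` and `rho_of_x` are ours.

```python
import numpy as np

S3 = np.sqrt(3.0)

def chi_B(x):
    """Characteristic function of the Bloch-vector set: 1.0 inside, 0.0 outside."""
    x1, x2, x3, x4, x5, x6, x7, x8 = x
    r2 = float(np.dot(x, x))
    ball = r2 <= 4.0 / 3.0                        # condition on the squared norm
    cubic = (8 - 18 * r2                          # 216 * det(rho) >= 0
             + 27 * x3 * (x1**2 + x2**2 - x6**2 - x7**2)
             - 6 * S3 * x8**3
             + 9 * S3 * x8 * (2 * (x3**2 + x4**2 + x5**2)
                              - (x1**2 + x2**2 + x6**2 + x7**2))
             + 54 * (x1*x4*x6 + x1*x5*x7 + x2*x5*x6 - x2*x4*x7))
    return 1.0 if (ball and cubic >= 0) else 0.0

def rho_of_x(x):
    """Matrix form of the statistical operator in the measurement eigenbasis."""
    x1, x2, x3, x4, x5, x6, x7, x8 = x
    return np.array([
        [1/3 + x3/2 + x8/(2*S3), (x1 - 1j*x2)/2,  (x4 - 1j*x5)/2],
        [(x1 + 1j*x2)/2,         1/3 - x8/S3,     (x6 - 1j*x7)/2],
        [(x4 + 1j*x5)/2,         (x6 + 1j*x7)/2,  1/3 - x3/2 + x8/(2*S3)],
    ])

def is_positive_semidefinite(x):
    """Direct test: all eigenvalues of rho(x) are non-negative."""
    return float(np.linalg.eigvalsh(rho_of_x(x)).min() >= 0)
```

The polynomial form avoids an eigenvalue decomposition at each sample point, which matters when the function is evaluated a very large number of times inside a numerical integration.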

We are almost ready to write the integrals of formula (12)
in coordinate form, i.e. as integrals over $\mathbb{R}^8$. It only remains
to specify the volume element13 $d\rho$ in coordinate form. What
we shall do is in fact the opposite: we define $d\rho$ to be the
volume element on $\mathbb{S}_3$ which in the coordinates $x$ is simply
$dx$. In differential-geometrical terms, $d\rho$ is the pull-back […]
of $dx$ induced by the map $\rho \mapsto x$:

$$
d\rho \mapsto dx.
\tag{25}
$$

Figure 2: Schematic illustration of the relations amongst $\mathbb{S}_3$, $\mathbb{B}_8$, and $\mathbb{C}_8$.



It is worth noting that this choice of volume element is not
arbitrary, but rather quite natural. On any $n$-dimensional convex
set $S$ we can define a volume element which is canonical
in respect of $S$'s convex structure, as follows. Consider any
convex isomorphism $c : S \to B$ between $S$ and some subset
$B \subset \mathbb{R}^n$. Consider the volume element on $B$ defined by

$$
\omega = \frac{dy}{\int_B dy},
\tag{26}
$$

where $dy$ is the canonical volume element on $\mathbb{R}^n$. The pull-back
$c^*(\omega)$ of $\omega$ onto $S$ then yields a volume element on the
latter. It is easy to see that the volume element thus induced
(1) does not depend on the particular isomorphism $c$ (and set
$B$) chosen, since all such isomorphisms are related by affine
coordinate changes ($y \mapsto Ay + b$, with $\det A \ne 0$, $b \in \mathbb{R}^n$);
(2) is invariant in respect of convex automorphisms of $S$; (3)
assigns unit volume to $S$, as clear from eq. (26). These properties
make this volume element canonical.14

Since the parametrisation $\mathbb{S}_3 \to \mathbb{B}_8$ is a convex isomorphism,
we see that $d\rho$ as defined in (25) is the canonical volume
element of $\mathbb{S}_3$ in respect of its convex structure.

We can finally write any integral over $\mathbb{S}_3$ in coordinate form.
If $\rho \mapsto f(\rho)$ is an integrable (possibly vector-valued) function
over $\mathbb{S}_3$, its integral becomes

$$
\int_{\mathbb{S}_3} f(\rho)\,d\rho = \int_{\mathbb{C}_8} f[\rho(x)]\,\chi_B(x)\,dx
= \int_{-1}^{1} dx_1 \cdots \int_{-1}^{1} dx_7 \int_{-2/\sqrt{3}}^{1/\sqrt{3}} dx_8\; f[\rho(x)]\,\chi_B(x).
\tag{27}
$$

12 On some later occasions the terms 'statistical operators' and 'Bloch vectors'
might be used interchangeably; but it should be clear from the context
which one is really meant.

13 An odd volume form […, § IV.B.1] (see also […]). Recall that a
metric structure is not required, only a differentiable one.

14 In measure-theoretic terms, we have the canonical measure
$B \mapsto m[c(B)]/m[c(S)]$, where $B$ is a set of the appropriate $\sigma$-field of $S$ and $m$
is the Lebesgue measure on $\mathbb{R}^n$.
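This form is especially suited to numerical integration by
computer and we shall use it hereafter. We can thus rewrite
the state-assignment formula (12) for $\rho_{D\land I}$ as:

$$
\rho_{D\land I}
= \frac{\int_{\mathbb{C}_8} \rho(x)\,\bigl[\prod_{i=1}^{3} \rho_{ii}(x)^{N_i}\bigr]\,\chi_B(x)\,g(x)\,dx}
       {\int_{\mathbb{C}_8} \bigl[\prod_{i=1}^{3} \rho_{ii}(x)^{N_i}\bigr]\,\chi_B(x)\,g(x)\,dx}.
\tag{28}
$$

Expanding the $\rho(x)$ inside the integrals using eq. (13) (equivalent
to (16)) we further obtain

$$
\rho_{D\land I} = \frac{I_3}{3} + \frac{1}{2}\sum_{j=1}^{8} \frac{L_j(\bar{N}, I)}{Z(\bar{N}, I)}\,\lambda_j,
\tag{29}
$$

where

$$
L_j(\bar{N}, I) := \int_{\mathbb{C}_8} x_j\,\Bigl[\prod_{i=1}^{3} \rho_{ii}(x)^{N_i}\Bigr]\,g(x)\,\chi_B(x)\,dx,
\tag{30a}
$$

for $j = 1, \dots, 8$, and

$$
Z(\bar{N}, I) := \int_{\mathbb{C}_8} \Bigl[\prod_{i=1}^{3} \rho_{ii}(x)^{N_i}\Bigr]\,g(x)\,\chi_B(x)\,dx.
\tag{30b}
$$

We shall omit the argument '$(\bar{N}, I)$' from both $L_j$ and $Z$ when
it should be clear from the context.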
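To indicate how the ratios $L_j/Z$ might be estimated in practice, here is a deliberately crude sketch. It uses plain pseudo-random Monte Carlo with rejection over the bounding box $[-1,1]^7 \times [-2/\sqrt{3}, 1/\sqrt{3}]$, which is not the quasi-Monte Carlo method used for the results of this paper. The sample size, the seed, the function names, and the optional callable `g` (an unnormalised prior density in the coordinates $x$) are all assumptions of the sketch.

```python
import numpy as np

S3 = np.sqrt(3.0)
LO = np.array([-1.0] * 7 + [-2.0 / S3])   # lower limits of the bounding box
HI = np.array([1.0] * 7 + [1.0 / S3])     # upper limits of the bounding box

def chi_B(x):
    """Vectorised characteristic function; x has shape (M, 8)."""
    x1, x2, x3, x4, x5, x6, x7, x8 = x.T
    r2 = np.sum(x * x, axis=1)
    cubic = (8 - 18 * r2
             + 27 * x3 * (x1**2 + x2**2 - x6**2 - x7**2)
             - 6 * S3 * x8**3
             + 9 * S3 * x8 * (2 * (x3**2 + x4**2 + x5**2)
                              - (x1**2 + x2**2 + x6**2 + x7**2))
             + 54 * (x1*x4*x6 + x1*x5*x7 + x2*x5*x6 - x2*x4*x7))
    return (r2 <= 4.0 / 3.0) & (cubic >= 0)

def diag_rho(x):
    """Diagonal elements (rho_11, rho_22, rho_33) for each row of x."""
    x3, x8 = x[:, 2], x[:, 7]
    return np.stack([1/3 + x3/2 + x8/(2*S3),
                     1/3 - x8/S3,
                     1/3 - x3/2 + x8/(2*S3)], axis=1)

def assign_bloch_vector(counts, g=None, samples=400_000, seed=0):
    """Monte Carlo estimate of the vector (L_1/Z, ..., L_8/Z)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(LO, HI, size=(samples, 8))
    x = x[chi_B(x)]                      # keep only points inside the Bloch-vector set
    w = np.prod(diag_rho(x) ** np.asarray(counts), axis=1)   # likelihood factor
    if g is not None:
        w = w * g(x)                     # unnormalised prior density (hypothetical hook)
    Z = w.sum()
    L = (x * w[:, None]).sum(axis=0)
    return L / Z                         # box volume and sample count cancel

x_no_data = assign_bloch_vector((0, 0, 0))   # constant prior, no data
x_five = assign_bloch_vector((5, 0, 0))      # constant prior, outcome '1' five times
rho11_five = 1/3 + x_five[2] / 2 + x_five[7] / (2 * S3)
```

Only a small fraction of the uniformly drawn points falls inside the Bloch-vector set, so the estimate is noisy; this is one reason why more refined integration schemes are preferable for actual calculations.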

It is now time to discuss the prior plausibility distributions 
adopted in our study. 

4. PRIOR KNOWLEDGE 

The prior knowledge I about the preparation is expressed as a prior plausibility distribution $p(\rho|I)\,d\rho = g(\rho)\,d\rho$. The last expression can be interpreted, in measure-theoretic terms [116; 117, § ID; 118] (cf. also [119, 120]), as '$\mu(d\rho)$', where $\mu$ is a normalised measure; or it can be simply interpreted, as we do here, as the product of a generalised function 15 g and the volume element $d\rho$. 16 The two points of view are not mutually exclusive of course, and these technical matters are only relatively important since $\mathbb{S}_3$ and the distributions we consider are quite well-behaved objects (and the simple Riemann integral suffices for our purposes).

We shall specify the plausibility distributions on $\mathbb{S}_3$ giving them directly in coordinate form on $B_8$ (with an abuse of notation for g):

\[ p(x|I)\,dx = g(x)\,dx := g[\rho(x)]\,dx. \tag{31} \]



15 We always use the term 'generalised function' in the sense of Egorov [121], whose theory is most general and nearest to the physicists' ideas and practice. Cf. Lighthill [122], Colombeau [123, 124, 125], and Oberguggenberger [126, 127].

16 It is always preferable to write not only the plausibility density, but the volume element as well. The combined expression is thus invariant under parameter changes; this also helps not to fall into some pitfalls such as those discussed by Soffer and Lynch [128].
Figure 3: Graph of the constant prior's marginal density $(x_3, x_8) \mapsto \int g_{\mathrm{co}}\,dx_1\,dx_2\,dx_4\,dx_5\,dx_6\,dx_7$. The triangle represents the boundary of $B_8$ in the $Ox_3x_8$ plane (see § 5.2, and cf. figs.)



The first kind of prior knowledge considered in our study, $I_{\mathrm{co}}$, has a constant density:

\[ p(x|I_{\mathrm{co}})\,dx = g_{\mathrm{co}}(x)\,dx \propto dx, \tag{32} \]



the proportionality constant being given by the inverse of the volume of $B_8$. This distribution hence corresponds to the canonical volume element (or the canonical measure) discussed in the previous section. Thus $I_{\mathrm{co}}$ expresses somehow "vague" prior knowledge (although we do not necessarily maintain that it be "uninformative"). Fig. 3 shows the marginal density of the coordinates $x_3$ and $x_8$ for this prior. The state-assignment formula which makes use of this prior assumes the simplified form

\[ \rho_{D\wedge I_{\mathrm{co}}} = \frac{\int_{C_8} \rho(x)\,\bigl[\prod_{i=1}^{3}\rho_{ii}(x)^{N_i}\bigr]\,\chi_B(x)\,dx}{\int_{C_8} \bigl[\prod_{i=1}^{3}\rho_{ii}(x)^{N_i}\bigr]\,\chi_B(x)\,dx}. \tag{33} \]



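The second prior to be considered expresses somehow better knowledge $I_{\mathrm{ga}}$ of the possible preparation. In coordinate form it is represented by the spherically symmetric Gaussian-like distribution

\[ p(x|I_{\mathrm{ga}})\,dx = g_{\mathrm{ga}}(x)\,dx \propto \exp\Bigl\{-\frac{\operatorname{tr}\{[\rho(x)-\rho(\hat x)]^2\}}{s^2}\Bigr\}\,dx = \exp\Bigl[-\frac{(x-\hat x)^2}{2s^2}\Bigr]\,dx, \tag{34} \]

with

\[ \hat x := (0, 0, 0, 0, 0, 0, 0, -2/\sqrt{3}), \quad\text{i.e., } \rho(\hat x) = |2\rangle\langle 2|, \qquad s = \frac{1}{2\sqrt{2}}. \tag{35} \]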

Regions in proximity of $|2\rangle\langle 2|$ have greater plausibility, and the plausibility of other regions decreases as their "distance" $\{\operatorname{tr}[\rho(x)-\rho(\hat x)]^2\}^{1/2} = |x-\hat x|/\sqrt{2}$ from $|2\rangle\langle 2|$ increases. The parameter s may be called the 'breadth' of the Gaussian-like function. 17 The marginal density of the coordinates $x_3$ and $x_8$ for this prior is shown in fig. 4. The state-assignment formula with the prior knowledge $I_{\mathrm{ga}}$ assumes the form

\[ \rho_{D\wedge I_{\mathrm{ga}}} = \frac{\int_{C_8} \rho(x)\,\bigl[\prod_{i=1}^{3}\rho_{ii}(x)^{N_i}\bigr]\,\exp\bigl[-\frac{(x-\hat x)^2}{2s^2}\bigr]\,\chi_B(x)\,dx}{\int_{C_8} \bigl[\prod_{i=1}^{3}\rho_{ii}(x)^{N_i}\bigr]\,\exp\bigl[-\frac{(x-\hat x)^2}{2s^2}\bigr]\,\chi_B(x)\,dx}. \tag{36} \]

Figure 4: Graph of the Gaussian-like prior's marginal density $(x_3, x_8) \mapsto \int g_{\mathrm{ga}}\,dx_1\,dx_2\,dx_4\,dx_5\,dx_6\,dx_7$. The triangle represents the boundary of $B_8$ in the $Ox_3x_8$ plane (see § 5.2, and cf. figs.)

In the following the function $g(x)$ will generically stand for $g_{\mathrm{co}}(x)$ or $g_{\mathrm{ga}}(x)$.
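As an illustration, the two densities of eqs. (32) and (34) can be written, up to normalisation, as ordinary functions of the Bloch vector. The sketch below uses only the values stated in eq. (35); the function names are hypothetical. It also evaluates the plausibility that an unbounded octavariate Gaussian on $\mathbb{R}^8$ would assign to the ball of radius s, which is the figure 0.00175 quoted in the footnote on the 'breadth' s; the corresponding value 0.0047 for the bounded domain requires an integration over $B_8$ and is not recomputed here.

```python
import math

S = 1 / (2 * math.sqrt(2))                         # breadth s, eq. (35)
X_HAT = (0, 0, 0, 0, 0, 0, 0, -2 / math.sqrt(3))   # Bloch vector of |2><2|, eq. (35)

def g_co(x):
    """Constant prior density of eq. (32), up to normalisation."""
    return 1.0

def g_ga(x, x_hat=X_HAT, s=S):
    """Gaussian-like prior density of eq. (34), up to normalisation."""
    r2 = sum((xi - hi) ** 2 for xi, hi in zip(x, x_hat))
    return math.exp(-r2 / (2 * s ** 2))

def gaussian_mass_within(r_over_s, dim=8):
    """Mass of an unbounded isotropic Gaussian on R^dim inside radius r (dim even).

    Uses the closed form of the chi-square distribution with an even
    number of degrees of freedom.
    """
    u = r_over_s ** 2 / 2
    partial_sum = sum(u ** k / math.factorial(k) for k in range(dim // 2))
    return 1 - math.exp(-u) * partial_sum

if __name__ == "__main__":
    print(round(gaussian_mass_within(1.0), 5))
```

The density is largest at $\hat x$ and has dropped by a factor $e^{-1/2}$ at distance s from it.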



5. EXPLICIT CALCULATION OF THE ASSIGNED 
STATISTICAL OPERATOR 

We shall now calculate the statistical operator given by (29), which means calculating the $L_j$ and $Z$ as given in (30a) and (30b), for the triples of absolute frequencies

N = 1: (1, 0, 0) and permutations thereof;
N = 2: (2, 0, 0), (1, 1, 0), and permutations;
N = 3: (3, 0, 0), (2, 1, 0), (1, 1, 1), and permutations;
N = 4, 5, 6, 7: (0, N, 0),

with the prior distribution $g_{\mathrm{co}}(x)\,dx$; and the triples

N = 1: (1, 0, 0), (0, 1, 0), (0, 0, 1);

with the Gaussian-like prior distribution $g_{\mathrm{ga}}(x)\,dx$.

A combination of symmetries of $B_8$ and numerical integration is used to compute $L_j$ and $Z$.

17 "Standard deviation" would be an improper name, since s has not all the usual properties of a standard deviation. E.g., although the Hessian determinant of the Gaussian-like density vanishes for $|x - \hat x| = s$, the total plausibility within a distance s from $\hat x$ is 0.0047, not 0.00175 as would be expected of an octavariate Gaussian distribution on $\mathbb{R}^8$ [129]. This is simply due to the bounded ranges of the coordinates.



5.1. Deduction of some Bloch-vector parameters for some data 
via symmetry arguments 

The coefficients $L_j$ for $j = 1, 2, 4, 5, 6, 7$ can be shown to vanish by symmetry arguments. Let us show that $L_1 = 0$ in particular. Consider

\[ L_1 = \int_{C_8} x_1 \Bigl[\prod_{i=1}^{3}\rho_{ii}(x)^{N_i}\Bigr]\, g(x)\,\chi_B(x)\,dx. \tag{37} \]

The transformation

\[ x = (x_1, x_2, x_3, x_4, x_5, x_6, x_7, x_8) \mapsto x' = (-x_1, x_2, x_3, x_4, -x_5, -x_6, x_7, x_8) \tag{38} \]

maps the domain $C_8$ bijectively onto itself, and the absolute value of its Jacobian determinant is equal to unity. Under this transformation we have that

\[ x'_1 = -x_1, \tag{39a} \]
\[ \rho_{ii}(x') = \rho_{ii}(x), \quad i = 1, 2, 3, \tag{39b} \]
\[ g(x') = g(x) \quad (\text{for both } g = g_{\mathrm{co}}, g_{\mathrm{ga}}), \tag{39c} \]
\[ \chi_B(x') = \chi_B(x). \tag{39d} \]



Applying the formula for the change of variables [130, 131] to (37), using the symmetries above, and renaming dummy integration variables we obtain

\[ \int_{C_8} x_1 \Bigl[\prod_{i=1}^{3}\rho_{ii}(x)^{N_i}\Bigr]\, g(x)\,\chi_B(x)\,dx = -\int_{C_8} x_1 \Bigl[\prod_{i=1}^{3}\rho_{ii}(x)^{N_i}\Bigr]\, g(x)\,\chi_B(x)\,dx, \tag{40} \]

i.e.,

\[ L_1 = 0. \tag{41} \]



Similarly one can show that $L_2$, $L_4$, $L_5$, $L_6$, $L_7$ are all zero
by changing the signs of the triplets (x2,x^,xj), (xi,xs,X(,), 
(x2, X4, X(,), (x2, X4, xe,), (x2, X5, X7), respectively. 

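The assigned statistical operator hence corresponds to the Bloch vector $(0, 0, L_3/Z, 0, 0, 0, 0, L_8/Z)$, for all triples of absolute frequencies and both kinds of prior knowledge. I.e., it has, in the eigenbasis $\{|1\rangle\langle 1|, |2\rangle\langle 2|, |3\rangle\langle 3|\}$, the diagonal matrix form

\[ \rho_{D\wedge I} = \begin{pmatrix} \frac{1}{3} + \frac{L_3 + L_8/\sqrt{3}}{2Z} & 0 & 0 \\ 0 & \frac{1}{3} - \frac{L_8}{\sqrt{3}\,Z} & 0 \\ 0 & 0 & \frac{1}{3} + \frac{-L_3 + L_8/\sqrt{3}}{2Z} \end{pmatrix} \tag{42} \]

(note that $L_3$, $L_8$ and $Z$ still depend on $\bar N$ and $I$).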

Two further changes of variables, both with unit Jacobian determinant and mapping $B_8$ one-to-one onto itself, can be used to reduce the calculations for some absolute-frequency triples $(N_1, N_2, N_3)$ to the calculation of other ones, with a reasoning similar to that used above for $L_1$.

The first is

\[ x = (x_1, x_2, x_3, x_4, x_5, x_6, x_7, x_8) \mapsto x' = (x_6, x_7, -x_3, x_4, -x_5, x_1, x_2, x_8), \tag{43} \]

under which, in particular,

\[ \rho_{11}(x') = \rho_{33}(x), \quad \rho_{33}(x') = \rho_{11}(x), \quad \rho_{22}(x') = \rho_{22}(x). \tag{44} \]

From eqs. (30) it follows that

\[ L_3(N_3, N_2, N_1) = -L_3(N_1, N_2, N_3), \tag{45a} \]
\[ L_8(N_3, N_2, N_1) = L_8(N_1, N_2, N_3), \tag{45b} \]
\[ Z(N_3, N_2, N_1) = Z(N_1, N_2, N_3), \tag{45c} \]

for both prior distributions $g_{\mathrm{co}}$ and $g_{\mathrm{ga}}$.

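The second change of variables is an anti-clockwise rotation of the plane $(x_3, x_8)$ by an angle $2\pi/3$ accompanied by permutations of the other coordinates:

\[ (x_1, x_2, x_3, x_4, x_5, x_6, x_7, x_8) \mapsto \Bigl(x_7, x_6, -\frac{x_3 + \sqrt{3}\,x_8}{2}, x_2, x_1, x_4, x_5, \frac{\sqrt{3}\,x_3 - x_8}{2}\Bigr), \tag{46} \]

under which, in particular,

\[ \rho_{11}(x') = \rho_{22}(x), \quad \rho_{22}(x') = \rho_{33}(x), \quad \rho_{33}(x') = \rho_{11}(x), \tag{47} \]

leading to

\[ L_3[(N_2, N_3, N_1), I_{\mathrm{co}}] = -\frac{1}{2} L_3[(N_1, N_2, N_3), I_{\mathrm{co}}] - \frac{\sqrt{3}}{2} L_8[(N_1, N_2, N_3), I_{\mathrm{co}}], \tag{48a} \]
\[ L_8[(N_2, N_3, N_1), I_{\mathrm{co}}] = \frac{\sqrt{3}}{2} L_3[(N_1, N_2, N_3), I_{\mathrm{co}}] - \frac{1}{2} L_8[(N_1, N_2, N_3), I_{\mathrm{co}}], \tag{48b} \]
\[ Z[(N_2, N_3, N_1), I_{\mathrm{co}}] = Z[(N_1, N_2, N_3), I_{\mathrm{co}}]. \tag{48c} \]

Note that the formulae from this transformation hold only for the constant prior $g_{\mathrm{co}}$: the Gaussian-like prior depends on $x_8$ through $|x - \hat x|$ and is not invariant under the rotation.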

From (45a) we see that, for both priors, $L_3$ vanishes for all triples of the form $(n, N - 2n, n)$ for some non-negative integer $n \le N/2$, in particular for $(0, N, 0)$ and $(n, n, n)$. In the last case $L_8 = 0$ as well, though only for the constant prior $g_{\mathrm{co}}$, as can be deduced from (45) and (48).

In the case of the prior knowledge $I_{\mathrm{co}}$, it is easy to realise that, repeatedly applying the two transformations above, one can derive the values of $L_3$, $L_8$, and $Z$ for all triples $(N_1, N_2, N_3)$ from the values for the triples with $N_2 \ge N_1 \ge N_3$ only.
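The symmetry arguments of this section can be checked numerically. The sketch below is only an illustration under stated assumptions: the matrices in `basis` are an assumed choice of the $\lambda_j$ of eq. (13), picked to be consistent with eqs. (35), (42), (43) and (46); transformation (38) is taken to flip the signs of $x_1$, $x_5$ and $x_6$; and the function names are hypothetical. For arbitrary $x$, each map should leave the spectrum of $\rho(x)$ unchanged (so that $B_8$ is mapped onto itself) and permute the diagonal elements as stated in eqs. (39b), (44) and (47).

```python
import numpy as np

SQ3 = np.sqrt(3.0)

def basis():
    """Assumed lambda_j: traceless, Hermitian, tr(l_i l_j) = 2 delta_ij."""
    lam = np.zeros((8, 3, 3), dtype=complex)
    lam[0, 0, 1] = lam[0, 1, 0] = 1
    lam[1, 0, 1], lam[1, 1, 0] = -1j, 1j
    lam[2] = np.diag([1, 0, -1])
    lam[3, 0, 2] = lam[3, 2, 0] = 1
    lam[4, 0, 2], lam[4, 2, 0] = -1j, 1j
    lam[5, 1, 2] = lam[5, 2, 1] = 1
    lam[6, 1, 2], lam[6, 2, 1] = 1j, -1j
    lam[7] = np.diag([1, -2, 1]) / SQ3
    return lam

def rho_of_x(x):
    """rho(x) = I/3 + (1/2) sum_j x_j lambda_j for a single Bloch vector x."""
    return np.eye(3) / 3 + 0.5 * np.tensordot(x, basis(), axes=1)

def t38(x):
    """Sign flips of eq. (38), as assumed here."""
    x1, x2, x3, x4, x5, x6, x7, x8 = x
    return np.array([-x1, x2, x3, x4, -x5, -x6, x7, x8])

def t43(x):
    """Transformation (43): exchanges rho_11 and rho_33."""
    x1, x2, x3, x4, x5, x6, x7, x8 = x
    return np.array([x6, x7, -x3, x4, -x5, x1, x2, x8])

def t46(x):
    """Transformation (46): rotation of the (x3, x8) plane by 2*pi/3."""
    x1, x2, x3, x4, x5, x6, x7, x8 = x
    return np.array([x7, x6, -(x3 + SQ3 * x8) / 2, x2, x1, x4, x5,
                     (SQ3 * x3 - x8) / 2])

def spectrum(x):
    """Eigenvalues of rho(x) in ascending order."""
    return np.linalg.eigvalsh(rho_of_x(x))

def diagonal(x):
    """Diagonal elements (rho_11, rho_22, rho_33) of rho(x)."""
    return np.real(np.diag(rho_of_x(x)))

if __name__ == "__main__":
    x = np.random.default_rng(0).uniform(-0.3, 0.3, size=8)
    for t in (t38, t43, t46):
        print(t.__name__, np.allclose(spectrum(t(x)), spectrum(x)))
```

Since the spectrum is preserved, a point lies in $B_8$ if and only if its image does, which is the content of eq. (39d) and of its analogues for (43) and (46).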



5.2. Numerical calculation for the remaining cases 

No other symmetry arguments seem available to derive $L_3$, $L_8$, and $Z$ for the remaining cases. In fact $L_3$, $L_8$ are in general non-zero ($Z$ can never vanish, its integrand being positive and never identically naught). It is very difficult (impossible perhaps?) to calculate the corresponding integrals analytically because of the complicated shape of $B_8$. Therefore we have resorted to numerical integration, using the quasi-Monte Carlo integration algorithms provided by Mathematica 5.2. 18

The resulting Bloch vectors for the constant prior $g_{\mathrm{co}}\,dx$ are shown for N = 1, 2, 3 in figs. 5, 6, and 7 respectively. We have included in fig. 5 the case N = 0 (i.e., no data) corresponding to the statistical operator $\rho_I$ that encodes the prior knowledge $I_{\mathrm{co}}$. In fig. 8 we have plotted the Bloch vectors corresponding to triples of the form $(N_1, N_2, N_3) = (0, N, 0)$ for N = 1, ..., 7.

The cases N = 0 and N = 1 for the Gaussian-like prior $g_{\mathrm{ga}}\,dx$ are shown in fig. 9. The case N = 0 corresponds to the statistical operator $\rho_I$ encoding the prior knowledge $I_{\mathrm{ga}}$.

The large triangle in the figures is the two-dimensional section of the set $B_8$ along the plane $Ox_3x_8$. It can, of course, also be considered as a section of the set of statistical operators $\mathbb{S}_3$. This section contains the eigenprojectors $|1\rangle\langle 1|$, $|2\rangle\langle 2|$, $|3\rangle\langle 3|$, which are the vertices of the triangle, as indicated. The assigned statistical operators, for all data and priors considered in this study, also lie on this triangle since they are mixtures of the eigenprojectors, as we found in § 5.1, eq. (42). They are represented by points labelled with the respective data triples. The points have planar coordinates $\bigl(L_3(\bar N, I)/Z(\bar N, I),\; L_8(\bar N, I)/Z(\bar N, I)\bigr)$.

The numerical-integration uncertainties $\epsilon_3$ and $\epsilon_8$, for $L_3/Z$ and $L_8/Z$ respectively, specified in the figures' legends, vary from ±0.0025 for the triplets with N = 2 to ±0.015 for various other triplets. Numerical integration has also been performed for those quantities that can be determined analytically (§ 5.1), like $L_3(0, N, 0)/Z(0, N, 0)$ e.g., and the numerical results agree, within the uncertainties, with the analytical ones.

A trade-off between, on the one hand, calculation time and, on the other, accuracy of the result was necessary. The accuracy parameters supplied to the integration routine were determined by previous rough numerical estimations of the results; in some cases an iterative process of this kind was adopted. The calculation of the statistical operator for a given triple of absolute frequencies $\bar N$ took from three to one hundred minutes, depending on the accuracy required and the complexity of the integrands.

The statistical operators encoding the various kinds of data and prior knowledge are given in explicit form in table I. Note that the uncertainties for the statistical operators should be written as $\epsilon_3\lambda_3/2 + \epsilon_8\lambda_8/2$ (cf. eq. (29)); however, we adopted a more compact notation in the table (see footnote a there).

18 The programmes are available upon request.

Table I: Statistical operators assigned for the various absolute-frequency data and priors considered in this study. All operators are diagonal; the columns give the diagonal elements. Cf. figs.

Prior  N  Triple    rho_11               rho_22              rho_33               Remarks
I_co   0  (000)     1/3                  1/3                 1/3                  no data; encodes the prior knowledge I_co
I_co   1  (010)     0.300 ± 0.001 (a)    0.399 ± 0.003 (a)   0.300 ± 0.001 (a)    cases (100) and (001) obtained by permutation
I_co   2  (020)     0.2735 ± 0.0007 (a)  0.453 ± 0.001 (a)   0.2735 ± 0.0007 (a)
I_co   2  (101)     0.3642 ± 0.0007 (a)  0.272 ± 0.001 (a)   0.3642 ± 0.0007 (a)  other cases obtained by permutation
I_co   3  (030)     0.249 ± 0.001 (a)    0.502 ± 0.003 (a)   0.249 ± 0.001 (a)
I_co   3  (021)(b)  0.333 ± 0.004 (a)    0.418 ± 0.003 (a)   0.249 ± 0.004 (a)
I_co   3  (111)     1/3                  1/3                 1/3                  other cases obtained by permutation
I_co   4  (040)     0.230 ± 0.004 (a)    0.541 ± 0.009 (a)   0.230 ± 0.004 (a)    cases (400) and (004) obtained by permutation
I_co   5  (050)     0.215 ± 0.004 (a)    0.571 ± 0.009 (a)   0.215 ± 0.004 (a)    cases (500) and (005) obtained by permutation
I_co   6  (060)     0.201 ± 0.004 (a)    0.598 ± 0.009 (a)   0.201 ± 0.004 (a)    cases (600) and (006) obtained by permutation
I_co   7  (070)     0.191 ± 0.004 (a)    0.619 ± 0.009 (a)   0.191 ± 0.004 (a)    cases (700) and (007) obtained by permutation
I_ga   0  (000)     0.195 ± 0.004 (a)    0.609 ± 0.009 (a)   0.195 ± 0.004 (a)    no data; encodes the prior knowledge I_ga
I_ga   1  (010)     0.180 ± 0.004 (a)    0.640 ± 0.009 (a)   0.180 ± 0.004 (a)
I_ga   1  (100)     0.239 ± 0.006 (a)    0.575 ± 0.009 (a)   0.186 ± 0.006 (a)
I_ga   1  (001)     0.186 ± 0.006 (a)    0.575 ± 0.009 (a)   0.239 ± 0.006 (a)

(a) Note that only two of the three uncertainties of the diagonal elements are independent; see § 5.2.

(b) This has been computed from the average of the cases (021) and (120) (appropriately permuted).

The results for N = 2 and N = 3 show an intriguing feature, immediately apparent in figs. 6 and 7: the computed Bloch vectors seem to maintain the convex structure of the respective data. What we mean is the following. For given N, the set of possible triples of absolute frequencies $(N_1, N_2, N_3)$ has a natural convex structure with the extreme points $(N, 0, 0)$, $(0, N, 0)$, and $(0, 0, N)$:

\[ (N_1, N_2, N_3) = (f_1 N, f_2 N, f_3 N) = f_1 (N, 0, 0) + f_2 (0, N, 0) + f_3 (0, 0, N), \tag{49} \]

where we have introduced the relative frequencies $f_i := N_i/N$. Denote the Bloch vector corresponding to the triple $(N_1, N_2, N_3) = (f_1 N, f_2 N, f_3 N)$ by

\[ v(N_1, N_2, N_3) := \bigl(0, 0, L_3(\bar N, I_{\mathrm{co}})/Z(\bar N, I_{\mathrm{co}}), 0, 0, 0, 0, L_8(\bar N, I_{\mathrm{co}})/Z(\bar N, I_{\mathrm{co}})\bigr). \tag{50} \]

These Bloch vectors (and hence the statistical operators) seem, from figs. 6 and 7, to respect the same convex combinations as their respective triples:

\[ v(f_1 N, f_2 N, f_3 N) \approx f_1\, v(N, 0, 0) + f_2\, v(0, N, 0) + f_3\, v(0, 0, N). \tag{51} \]

In terms of the integrals (30) defining $L_3$, $L_8$, $Z$, and using the Bloch-vector form of the statistical operator or (42), the seeming equation above becomes

\[ \frac{L_j(f_1 N, f_2 N, f_3 N)}{Z(f_1 N, f_2 N, f_3 N)} \approx f_1 \frac{L_j(N, 0, 0)}{Z(N, 0, 0)} + f_2 \frac{L_j(0, N, 0)}{Z(0, N, 0)} + f_3 \frac{L_j(0, 0, N)}{Z(0, 0, N)}, \qquad j = 3, 8, \tag{52} \]

a remarkable expression. Does it hold exactly? We have not tried to prove or disprove its analytical validity, but it surely deserves further investigation. [Post scriptum: Slater, using cylindrical algebraic decomposition [132, 133, 134] and a parametrisation by Bloore [cf. 135], has confirmed that eq. (52) holds exactly. In fact, he has remarked that some of the integrals, here numerically calculated, can be solved analytically by his approach.]
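A simple closed form reproduces, within the stated uncertainties, the constant-prior values for the triples $(0, N, 0)$, $(1, 0, 1)$ and $(1, 1, 1)$ in table I, and makes the linearity of eq. (52) evident. The following sketch is offered as a cross-check only, and is not the derivation of this paper or of Slater. It assumes that the constant prior coincides with the flat (Hilbert-Schmidt) measure on $\mathbb{S}_3$ and uses the known fact that this measure is induced by partial tracing of uniformly distributed pure states of a 3 × 3 composite system, so that the diagonal $(\rho_{11}, \rho_{22}, \rho_{33})$ has a Dirichlet(3, 3, 3) distribution; multiplying by $\prod_i \rho_{ii}^{N_i}$ then gives a Dirichlet posterior with mean $(N_i + 3)/(N + 9)$. The function names are hypothetical.

```python
from fractions import Fraction

def posterior_diagonal(counts, dim=3):
    """Posterior-mean diagonal (N_i + dim) / (N + dim^2) for the constant prior.

    Assumption: the diagonal of a flatly distributed dim x dim statistical
    operator follows a Dirichlet(dim, ..., dim) distribution.
    """
    n = sum(counts)
    return tuple(Fraction(c + dim, n + dim * dim) for c in counts)

def convex_combination(counts):
    """Right-hand side of eq. (51), expressed through the diagonal elements."""
    n, d = sum(counts), len(counts)
    vertices = [posterior_diagonal(tuple(n if k == i else 0 for k in range(d)))
                for i in range(d)]
    return tuple(sum(Fraction(counts[i], n) * vertices[i][j] for i in range(d))
                 for j in range(d))

if __name__ == "__main__":
    for triple in [(0, 1, 0), (0, 2, 0), (1, 0, 1), (0, 3, 0), (0, 7, 0)]:
        print(triple, [round(float(p), 4) for p in posterior_diagonal(triple)])
```

For instance the triple (0, 1, 0) gives (3/10, 2/5, 3/10), to be compared with the entries 0.300, 0.399, 0.300 of table I, and (1, 0, 1) gives (4/11, 3/11, 4/11), to be compared with 0.3642, 0.272, 0.3642.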



6. TAKING ACCOUNT OF THE UNCERTAINTIES IN THE 
DETECTION OF OUTCOMES 

Uncertainties are normally to be found in one's measure- 
ment data, and need to be taken into account in the state- 
assignment procedure. For frequency data the uncertainty can 
stem from a combination of "over-counting", i.e. the regis- 
tration (because of background noise e.g.) of some events as 
outcomes when there are in fact none, and "under-counting", 
i.e. the failure (because of detector limitations, e.g.) to register 
some outcomes. 

Let us model the measurement-data uncertainty as follows, for definiteness. We say that the plausibility of registering the "event" 'i' when the outcome 'μ' is obtained is

\[ p(i\,|\,\mu \wedge I) = h(i\,|\,\mu). \tag{53} \]

The event 'i' belongs to some given set that may include



such events as e.g. the 'null', no-detection event; the num- 
ber of events need not be the same as the number of out- 
comes. The model formalised in the equation above suffices 
in many cases. Other models could take into account, e.g. 
"non-local" or memory effects, so that the plausibility of an 
event could depend on a set of previous or simultaneous out- 
comes. We thus definitely enter the realm of communication 
theory [Jl36| [l37|, |l3| [l39|, Q (see also @ 

Given the preparation represented by the statistical operator $\rho$, and the positive-operator-valued measure $\{E_\mu\}$ representing the measurement with outcomes $\{\mu\}$, the plausibility of registering the event 'i' in a measurement instance is, by the rules of plausibility theory, 19

\[ p(i\,|\,\rho) = \sum_\mu p(i\,|\,\mu \wedge \rho)\, p(\mu\,|\,\rho) = \sum_\mu h(i\,|\,\mu)\, \operatorname{tr}(E_\mu \rho). \tag{54} \]

This marginalisation could be carried over to the state- 
assignment formulae already discussed in § ||, and the for- 
mulae thus obtained would take into account the outcome- 
registration uncertainties. 

However, it is much simpler to introduce a new positive-operator-valued measure $\{A_i\}$ defined by

\[ A_i := \sum_\mu h(i\,|\,\mu)\, E_\mu, \tag{55} \]

so that the plausibilities $p(i\,|\,\rho)$ in eq. (54) can be written, by the linearity of the trace,

\[ p(i\,|\,\rho) = \operatorname{tr}(A_i \rho). \tag{56} \]

19 It is assumed that knowledge of the state is redundant in the plausibility assignment of the event 'i' when the outcome is already known.



In the state assignment we can simply use the new positive-operator-valued measure, which includes the outcome-registration uncertainties, in place of the old one. The last procedure is also more in the spirit of quantum mechanics: it is analogous to the use of the statistical operator $p_1\rho_1 + p_2\rho_2$ when we are unsure (with plausibilities $p_1$ and $p_2$) about whether $\rho_1$ or $\rho_2$ holds. I.e., we can "mix" positive-operator-valued-measure elements just like we mix statistical operators. In fact, we could even mix, with a similar procedure, whole positive-operator-valued measures: a procedure which would represent the fact that there are uncertainties in the identification not only of the outcomes, but of the whole measurement procedure as well. See Peres' partially related discussion [141].
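The "mixing" of measure elements described in this section can be illustrated numerically. The sketch below is a hypothetical example, not taken from this study: it assumes a projective measurement in the basis $\{|1\rangle, |2\rangle, |3\rangle\}$, four registered events (the fourth being a 'null', no-detection event), and an arbitrary made-up table `h` of conditional plausibilities of an event given an outcome, whose columns sum to one. It builds the new measure by summing the original elements weighted with these plausibilities, and checks that the result is normalised and that the trace rule for the new measure agrees with the marginalisation over outcomes.

```python
import numpy as np

def noisy_povm(h, povm):
    """New elements A_i = sum_mu h[i, mu] * E_mu."""
    return np.einsum('im,mab->iab', np.asarray(h, dtype=complex), np.asarray(povm))

# Projective measurement: E_mu = |mu><mu| for mu = 1, 2, 3.
E = np.array([np.diag(row).astype(complex) for row in np.eye(3)])

# Hypothetical registration model: rows are events, columns are outcomes.
# The last row is the 'null' event; each column sums to one.
h = np.array([[0.90, 0.05, 0.00],
              [0.05, 0.90, 0.05],
              [0.00, 0.00, 0.85],
              [0.05, 0.05, 0.10]])

rho = np.diag([0.2, 0.5, 0.3]).astype(complex)   # an example statistical operator

A = noisy_povm(h, E)
p_event = np.real(np.einsum('iab,ba->i', A, rho))    # tr(A_i rho)
p_outcome = np.real(np.einsum('mab,ba->m', E, rho))  # tr(E_mu rho)

if __name__ == "__main__":
    print(np.round(p_event, 3))
```

The number of events (four) differs here from the number of outcomes (three), which the model explicitly allows.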



7. LARGE-N LIMIT

7.1. General case

Let us briefly consider the case of data with very large N. We summarise some results obtained in [71]. Mathematically we want to see what form the state-assignment formulae take in the limit $N \to \infty$. Consider a sequence of data sets $\{D_N\}_{N=1}^{\infty}$. Each $D_N$ consists in some knowledge about the outcomes of N instances of the same measurement. The latter is represented by the positive-operator-valued measure $\{E_i\}$. The plausibility distribution for the outcomes, given the preparation $\rho$, is

\[ q(\rho) := (q_i(\rho)) \quad\text{with}\quad q_i(\rho) = \operatorname{tr}(E_i \rho). \tag{57} \]

Let us consider more precisely the general situation in which each data set $D_N$ consists in the knowledge that the relative frequencies $f = (f_i) := (N_i/N)$ lie in a region $\Phi_N$ (with non-empty interior and whose boundary has measure zero in respect of the prior plausibility measure). Such kind of data arise when the registration of measurement outcomes is affected by uncertainties and is moreover "coarse-grained" for practical purposes, so that not precise frequencies are obtained but rather a region, like $\Phi_N$, of possible ones.

For each data set we then have a resulting posterior distribution for the statistical operators,

\[ p(\rho\,|\,D_N \wedge I)\,d\rho = p[\rho\,|\,(f \in \Phi_N) \wedge I]\,d\rho = \frac{p(f \in \Phi_N\,|\,\rho)\; p(\rho\,|\,I)\,d\rho}{\int_{\mathbb{S}} p(f \in \Phi_N\,|\,\rho)\; p(\rho\,|\,I)\,d\rho}, \tag{58} \]

and an associated statistical operator $\rho_{D_N \wedge I} := \int \rho\; p(\rho\,|\,D_N \wedge I)\,d\rho$.

Assume that the sequence $\{\Phi_N\}_{N=1}^{\infty}$ of such frequency regions converges (in a topological sense specified in [71]) to a region $\Phi_\infty$ (also with non-empty interior and with boundary of measure zero). We shall see later what happens when such a region shrinks to a single point, i.e. when the uncertainties become smaller and smaller. In [71] it is shown, using some theorems in Csiszár [142] and Csiszár and Shields [143], that

\[ p(\rho\,|\,D_N \wedge I)\,d\rho \propto \begin{cases} 0, & \text{if } q(\rho) \notin \Phi_\infty, \\ p(\rho\,|\,I)\,d\rho, & \text{if } q(\rho) \in \Phi_\infty, \end{cases} \qquad\text{as } N \to \infty. \tag{59} \]

In other words: as the number of measurements becomes large, the plausibility of the statistical operators that encode a plausibility distribution not equal to one of the measured frequencies vanishes, so that the whole plausibility gets concentrated on the statistical operators encoding plausibility distributions equal to the possible frequencies. This is an intuitively satisfying result. The data single out a set of statistical operators, and these are then given weight according to the prior $p(\rho\,|\,I)\,d\rho$, specified by us.

If $\Phi_\infty$ degenerates into a single frequency value $f^*$, the expression above becomes, as shown in [71],

\[ p[\rho\,|\,(f = f^*) \wedge I]\,d\rho \propto p(\rho\,|\,I)\;\delta[q(\rho) - f^*]\,d\rho, \tag{60} \]

which was also intuitively expected.

Note that if the prior density vanishes for such statistical operators as are singled out by the data, then the equations above become meaningless (no normalisation is possible), revealing a contradiction between the prior knowledge and the measurement data.



7.2. Present case 

In the case of our study, the derivation above shows that, as $N \to \infty$ and the triple of relative frequencies $f = (f_1, f_2, f_3) := (N_1, N_2, N_3)/N$ tends to some value $f^*$, the diagonal elements of the assigned statistical operator $\rho_{D_N \wedge I}$ tend to

\[ p(i\,|\,\rho_{D_N \wedge I}) = (\rho_{D_N \wedge I})_{ii} \to f^*_i \qquad\text{as } N \to \infty. \tag{61} \]

Combining this with the results of § 5.1 concerning the off-diagonal elements, we find that the assigned statistical operator has in the limit the form

\[ \begin{pmatrix} f^*_1 & 0 & 0 \\ 0 & f^*_2 & 0 \\ 0 & 0 & f^*_3 \end{pmatrix} \tag{62} \]

for both studied priors. This is again an expected result. Only the diagonal elements of the statistical operator are affected by the data, and as the amount of data increases it overwhelms the prior information affecting the diagonal elements. Both priors are moreover symmetric in respect of the off-diagonal elements, which thus get a vanishing average.
8. DISCUSSION AND CONCLUSIONS

Bayesian quantum-state assignment techniques have been studied for some time now but, as far as we know, have never been applied to the whole set of statistical operators of systems with more than two levels. And they have never been used for state assignment in real cases. In this study we have applied such methods to a three-level system, showing that the numerical implementation is possible and simple in principle. This paper should therefore not only be of theoretical interest but also be of use to experimentalists involved in state estimation. The time required to obtain the numerical results was relatively short in this three-level case, which involved an eight-dimensional integration. Application to higher-level systems should also be feasible, if one considers that integrals involving hundreds of dimensions are computed in financial, particle-physics, and image-processing problems (see e.g. the (somewhat dated) refs. [144, 145, 146, 147, 148]).



Bayesian methods always take into account prior knowledge. We have given examples of state assignment in the case of "vague" prior knowledge, as well as in the case of a kind of somehow better knowledge assigning higher plausibility to statistical operators in the vicinity of a given pure one. A comparison of the resulting statistical operators for the same kind of data is quickly obtained by looking at figs. 5 and 9 (or at the respective statistical operators in table I). It is clear that when the available amount of data is small (as is the case in those figures, which concern data with no or only one measurement outcome), prior knowledge is very relevant. Any practised experimentalist usually has some kinds of prior knowledge in many experimental situations, which arise from past experience with similar situations. With some practice in "translating" these kinds of prior knowledge into distribution functions, one could employ small amounts of data in the most efficient way.

The generalisation of the present study to data involving 
different kinds of measurement is straightforward. Of course, 
in the general case one has to numerically determine a greater 
number of parameters (the Lj) and therefore compute a greater 
number of integrals. It would also be interesting to look at the 
results for other kinds of priors, in particular "special" priors 
like the Bures one [JT^J, |l49[ [B^, [BJ [BjL We found a 
particular non-trivial numerical relation, eq. (52), between the
results obtained for the constant prior; it would be interesting 
to know whether it holds exactly. 

In the next paper [ |83t we shall give examples of numeri- 
cal quantum-state assignment for data consisting in average 
values instead of absolute frequencies; and besides the two 
priors considered here we shall employ another prior studied 
by Slater [0. 



the kind staff of the KTH Biblioteket, the Forum biblioteket 
in particular, for their irreplaceable work. 

Post scriptum: We cordially thank Paul B. Slater for point[...]tion, by which some of the integrals of this paper can be solved analytically, and for other important remarks.



Appendix: DETERMINATION OF C 8 

Any hyperplane tangent to (supporting) a convex set must touch the latter on at least an extreme point [109, 110, 111, 113, 114, 153]. To determine the hyper-sides of the minimal hyper-box $C_8$ containing $B_8$ we need therefore consider only the extreme points of the latter, i.e., the pure states.

A generic ray of a three-dimensional complex Hilbert space can be written as

\[ |\phi\rangle = a\,|1\rangle + e^{i\beta} b\,|2\rangle + e^{i\gamma} c\,|3\rangle, \tag{A.1} \]

with

\[ 0 \le \beta, \gamma < 2\pi, \qquad a, b, c \ge 0, \qquad a^2 + b^2 + c^2 = 1; \tag{A.2} \]

note that any two of the parameters a, b, c can be chosen independently in the range [0, 1], subject to the sum of their squares not exceeding unity. The corresponding pure statistical operator is

\[ |\phi\rangle\langle\phi| = \begin{pmatrix} a^2 & e^{-i\beta} ab & e^{-i\gamma} ac \\ e^{i\beta} ab & b^2 & e^{i(\beta-\gamma)} bc \\ e^{i\gamma} ac & e^{-i(\beta-\gamma)} bc & c^2 \end{pmatrix}. \tag{A.3} \]

All pure states have this form, with the parameters in the ranges (A.2). Equating this expression with the one in terms of the Bloch-vector components $(x_i)$, eq. (16), we obtain after some algebraic manipulation a parametric expression for the Bloch vectors of the pure states:

\[ \begin{aligned} x_1 &= 2ab\cos\beta, & x_2 &= 2ab\sin\beta, \\ x_3 &= a^2 - c^2, & x_4 &= 2ac\cos\gamma, \\ x_5 &= 2ac\sin\gamma, & x_6 &= 2bc\cos(\beta-\gamma), \\ x_7 &= 2bc\sin(\beta-\gamma), & x_8 &= -\sqrt{3}\,(b^2 - 1/3). \end{aligned} \tag{A.4} \]

These parametric equations define the four-dimensional subset of the extreme points of $B_8$. It takes little effort to see that, as a, b, c, β, and γ vary in the ranges (A.2), each of the first seven coordinates above ranges in the interval $[-1, 1]$ and the eighth in the interval $[-2/\sqrt{3}, 1/\sqrt{3}]$. The rectangular region given by the Cartesian product of these intervals is thus $C_8$ as defined in eq. (24), q.e.d.
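The ranges stated in this appendix can be checked numerically. The sketch below is an illustration only: it assumes the parametrisation (A.4) in the form $x_3 = a^2 - c^2$, $x_8 = -\sqrt{3}(b^2 - 1/3)$, with a factor 2 in all off-diagonal components, which is the form consistent with eq. (35) and with the intervals quoted above. It samples pure states, verifies that all Bloch vectors lie in the box, and checks that they all have squared length 4/3, as follows from $\operatorname{tr}\rho^2 = 1$. The function names are hypothetical.

```python
import numpy as np

SQ3 = np.sqrt(3.0)

def pure_state_bloch(a, b, c, beta, gamma):
    """Bloch vector of a|1> + e^{i beta} b|2> + e^{i gamma} c|3>, as in (A.4)."""
    return np.array([
        2 * a * b * np.cos(beta),
        2 * a * b * np.sin(beta),
        a ** 2 - c ** 2,
        2 * a * c * np.cos(gamma),
        2 * a * c * np.sin(gamma),
        2 * b * c * np.cos(beta - gamma),
        2 * b * c * np.sin(beta - gamma),
        -SQ3 * (b ** 2 - 1 / 3),
    ])

def sample_pure_bloch(n, seed=0):
    """Bloch vectors (shape (8, n)) of n randomly drawn pure states."""
    rng = np.random.default_rng(seed)
    v = np.abs(rng.normal(size=(3, n)))
    a, b, c = v / np.linalg.norm(v, axis=0)      # a, b, c >= 0 with unit norm
    beta, gamma = rng.uniform(0, 2 * np.pi, size=(2, n))
    return pure_state_bloch(a, b, c, beta, gamma)

LOWER = np.array([-1.0] * 7 + [-2 / SQ3])        # lower corner of C_8
UPPER = np.array([1.0] * 7 + [1 / SQ3])          # upper corner of C_8

if __name__ == "__main__":
    x = sample_pure_bloch(100_000)
    print(np.round(x.min(axis=1), 3))
    print(np.round(x.max(axis=1), 3))
```

The bounds of the eighth coordinate are attained at $b = 1$ (the state $|2\rangle$) and at $b = 0$ respectively.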
Figure 5: Bloch vectors of the assigned statistical operator for prior knowledge $I_{\mathrm{co}}$ and absolute-frequency triples with N = 0 and N = 1, computed by numerical integration. The large triangle in the figures is the two-dimensional section of the set $B_8$ along the plane $Ox_3x_8$. The numerical-integration uncertainty in the $x_3$ and $x_8$ components is ±0.005. In the case of no data (N = 0), the statistical operator assigned on the basis of the prior knowledge $I_{\mathrm{co}}$ alone is the "completely mixed" one $I_3/3$. Note that all the components of all four points have been determined by numerical integration, even those that can be exactly determined by symmetry arguments. Within the given uncertainties, numerical computations yielded the exact results.

Figure 6: Bloch vectors of the assigned statistical operator for prior knowledge $I_{\mathrm{co}}$ and absolute-frequency triples with N = 2, computed by numerical integration. The large triangle in the figures is the two-dimensional section of the set $B_8$ along the plane $Ox_3x_8$. The numerical-integration uncertainty in the $x_3$ and $x_8$ components is ±0.0025. Note that all the components of all six points have been determined by numerical integration, even those that can be exactly determined by symmetry arguments. Within the given uncertainties, numerical computations yielded the exact results.

Figure 7: Bloch vectors of the assigned statistical operator for prior knowledge $I_{\mathrm{co}}$ and absolute-frequency triples with N = 3, computed by numerical integration. The large triangle in the figures is the two-dimensional section of the set $B_8$ along the plane $Ox_3x_8$. The numerical-integration uncertainty in the $x_3$ and $x_8$ components is ±0.005. Note that all the components of all ten points have been determined by numerical integration, even those that can be exactly determined by symmetry arguments. Within the given uncertainties, numerical computations yielded the exact results.

Figure 8: Bloch vectors of the assigned statistical operator for prior knowledge $I_{\mathrm{co}}$ and absolute-frequency triples of the form (0, N, 0), with N = 1, 2, 3, 4, 5, 6, 7, computed by numerical integration. The large triangle in the figures is the two-dimensional section of the set $B_8$ along the plane $Ox_3x_8$. The numerical-integration uncertainty in the $x_3$ and $x_8$ components is ±0.015. Only the $x_8$ component was determined by numerical integration; the $x_3$ component vanishes for symmetry reasons.

Figure 9: Bloch vectors of the assigned statistical operator for prior knowledge $I_{\mathrm{ga}}$ and absolute-frequency triples with N = 0 and N = 1, computed by numerical integration. The large triangle in the figures is the two-dimensional section of the set $B_8$ along the plane $Ox_3x_8$. The prior knowledge is represented by a Gaussian-like distribution of "breadth" $s = 1/(2\sqrt{2})$ centred on the pure statistical operator $|2\rangle\langle 2|$; see § 4. The small circular arc is the locus of the Bloch vectors (on the plane) at a distance $|x - \hat x| = s$ from the vector $\hat x := (0, 0, 0, 0, 0, 0, 0, -2/\sqrt{3})$ corresponding to the statistical operator $|2\rangle\langle 2|$. In the case of no data (N = 0), the statistical operator assigned on the basis of the prior knowledge $I_{\mathrm{ga}}$ alone lies in between the completely mixed one and the pure one $|2\rangle\langle 2|$.



